Topic-Level Random Walk through Probabilistic Model

نویسندگان

  • Zi Yang
  • Jie Tang
  • Jing Zhang
  • Juan-Zi Li
  • Bo Gao
چکیده

In this paper, we study the problem of topic-level random walk, which concerns the random walk at the topic level. Previously, several related works such as topic sensitive page rank have been conducted. However, topics in these methods were predefined, which makes the methods inapplicable to different domains. In this paper, we propose a four-step approach for topic-level random walk. We employ a probabilistic topic model to automatically extract topics from documents. Then we perform the random walk at the topic level. We also propose an approach to model topics of the query and then combine the random walk ranking score with the relevance score based on the modeling results. Experimental results on a real-world data set show that our proposed approach can significantly outperform the baseline methods of using language model and that of using traditional PageRank.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

یک مدل موضوعی احتمالاتی مبتنی بر روابط محلّی واژگان در پنجره‌های هم‌پوشان

A probabilistic topic model assumes that documents are generated through a process involving topics and then tries to reverse this process, given the documents and extract topics. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...

متن کامل

Intra-Speaker Topic Modeling for Improved Multi-Party Meeting Summarization with Integrated Random Walk

This paper proposes an improved approach to extractive summarization of spoken multi-party interaction, in which integrated random walk is performed on a graph constructed on topical/ lexical relations. Each utterance is represented as a node of the graph, and the edges’ weights are computed from the topical similarity between the utterances, evaluated using probabilistic latent semantic analys...

متن کامل

Grounding Topic Models with Knowledge Bases

Topic models represent latent topics as probability distributions over words which can be hard to interpret due to the lack of grounded semantics. In this paper, we propose a structured topic representation based on an entity taxonomy from a knowledge base. A probabilistic model is developed to infer both hidden topics and entities from text corpora. Each topic is equipped with a random walk ov...

متن کامل

Incrementalizing MCMC in Probabilistic Programs Through Tracing and Slicing

Probabilistic programming enables high-level specification of complex probabilistic models without the need for hand-built inference techniques. We present a dynamic compilation technique for increasing the efficiency of Markov Chain Monte Carlo (MCMC) inference on probabilistic programs. We focus on Church, a probabilistic extension of untyped call-by-value lambda calculus. MCMC in this settin...

متن کامل

A Novel Topic-Level Random Walk Framework for Scene Image Co-segmentation

Image co-segmentation is popular with its ability to detour supervisory data by exploiting the common information in multiple images. In this paper, we aim at a more challenging branch called scene image co-segmentation, which jointly segments multiple images captured from the same scene into regions corresponding to their respective classes. We first put forward a novel representation named Vi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009